feat: Improve CometSortMergeJoin statistics #304

Merged
2 commits merged into apache:main on Apr 23, 2024

planga82
Contributor

Which issue does this PR close?

Closes #303.

Rationale for this change

Expose in CometSortMergeJoin all the statistics that the DataFusion SortMergeJoinExec node provides.

What changes are included in this PR?

All available metrics

    /// Total time for joining probe-side batches to the build-side batches
    join_time: metrics::Time,
    /// Number of batches consumed by this operator
    input_batches: metrics::Count,
    /// Number of rows consumed by this operator
    input_rows: metrics::Count,
    /// Number of batches produced by this operator
    output_batches: metrics::Count,
    /// Number of rows produced by this operator
    output_rows: metrics::Count,
    /// Peak memory used for buffered data.
    /// Calculated as sum of peak memory values across partitions
    peak_mem_used: metrics::Gauge,
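
To illustrate how the metrics listed above might roll up from per-partition values into a single operator-level view, here is a minimal sketch. It is not the code from this PR: `JoinMetricsSnapshot` and `aggregate` are hypothetical names, and plain std types stand in for DataFusion's `metrics::Time`, `metrics::Count`, and `metrics::Gauge`. The only behavior taken from the source is that peak memory is reported as the sum of the per-partition peaks.

```rust
use std::time::Duration;

/// Hypothetical per-partition snapshot of the metrics listed in the PR.
/// Field names mirror the PR description; the types are std stand-ins,
/// not DataFusion's metric types.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct JoinMetricsSnapshot {
    /// Total time spent joining probe-side batches to build-side batches
    pub join_time: Duration,
    /// Number of batches consumed
    pub input_batches: usize,
    /// Number of rows consumed
    pub input_rows: usize,
    /// Number of batches produced
    pub output_batches: usize,
    /// Number of rows produced
    pub output_rows: usize,
    /// Peak memory (bytes) used for buffered data in this partition
    pub peak_mem_used: usize,
}

/// Combine per-partition snapshots into one operator-level snapshot.
/// Every field is summed, including `peak_mem_used`, which follows the
/// "sum of peak memory values across partitions" rule from the doc comment.
pub fn aggregate(partitions: &[JoinMetricsSnapshot]) -> JoinMetricsSnapshot {
    partitions
        .iter()
        .fold(JoinMetricsSnapshot::default(), |mut acc, p| {
            acc.join_time += p.join_time;
            acc.input_batches += p.input_batches;
            acc.input_rows += p.input_rows;
            acc.output_batches += p.output_batches;
            acc.output_rows += p.output_rows;
            acc.peak_mem_used += p.peak_mem_used;
            acc
        })
}
```

Summing the peaks gives an upper bound on simultaneous memory use, since partitions need not hit their peaks at the same moment.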

How are these changes tested?

Unit testing and manual testing

viirya changed the title from "Improve CometSortMergeJoin statistics" to "feat: Improve CometSortMergeJoin statistics" on Apr 23, 2024
Member

viirya left a comment

Thank you @planga82. This looks good to me.

Member

andygrove left a comment

Thanks @planga82

andygrove merged commit 646f9a0 into apache:main on Apr 23, 2024
29 of 30 checks passed
planga82 deleted the feature/improve_metrics branch on April 23, 2024, 15:08
himadripal pushed a commit to himadripal/datafusion-comet that referenced this pull request Sep 7, 2024
* Improve CometSortMergeJoin statistics

* Add tests

---------

Co-authored-by: Pablo Langa <[email protected]>